Feature-Cost Sensitive Learning with Submodular Trees of Classifiers
نویسندگان
چکیده
During the past decade, machine learning algorithms have become commonplace in large-scale real-world industrial applications. In these settings, the computation time to train and test machine learning algorithms is a key consideration. At training-time the algorithms must scale to very large data set sizes. At testing-time, the cost of feature extraction can dominate the CPU runtime. Recently, a promising method was proposed to account for the feature extraction cost at testing time, called Cost-sensitive Tree of Classifiers (CSTC). Although the CSTC problem is NP-hard, the authors suggest an approximation through a mixed-norm relaxation across many classifiers. This relaxation is slow to train and requires involved optimization hyperparameter tuning. We propose a different relaxation using approximate submodularity, called Approximately Submodular Tree of Classifiers (ASTC). ASTC is much simpler to implement, yields equivalent results but requires no optimization hyperparameter tuning and is up to two orders of magnitude faster to train.
منابع مشابه
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
متن کاملA Submodular-supermodular Procedure with Applications to Discriminative Structure Learning
In this paper, we present an algorithm for minimizing the difference between two submodular functions using a variational framework which is based on (an extension of) the concave-convex procedure [17]. Because several commonly used metrics in machine learning, like mutual information and conditional mutual information, are submodular, the problem of minimizing the difference of two submodular ...
متن کاملComparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images
Abstract: Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-required, free satellite data are available in coarse resolution and the use of manned aircraft is relatively costly. Recently, unmanned aeria...
متن کاملFeature Cost Sensitive Random Forest
In many applications, it is necessary to consider not only the predictive power of a machine learning model, but also its computational cost at test time. Here we explore greedy methods for feature cost sensitive random forest training. We also consider the case where different features share common subroutines or other dependencies such that computing one reduces the computational cost of othe...
متن کاملCoCoST: A Computational Cost Sensitive Classifier
Computational cost of classification is as important as accuracy in on-line classification systems. The computational cost is usually dominated by the cost of computing implicit features of the raw input data. Very few efforts have been made to design classifiers which perform effectively with limited computational power; instead, feature selection is usually employed as a pre-processing step t...
متن کامل